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We are concerned with the minimal residual method combined with polynomial preconditioning 
for solving large linear systems Ax - b with indefinite Hermitian coefficient matrices A. The 
standard approach for choosing the polynomial preconditioner leads to preconditioned systems 
which are postive definite. Here, we investigate a different strategy which leaves the 
preconditioned coefficient matrix indefinite. More precisely, the polynomial preconditioner is 
designed to cluster the positive, resp. negative eigenvalues of A around 1, resp. around some 
negative constant. In particular, it is shown that such indefinite polynomial preconditioners can be 
obtained as the optimal solutions of a certain two-parameter family of Chebyshev approximation 
problems. We establish some basic results for these approximation problems and sketch a Remez 
type algorithm for their numerical solution. The problem of selecting the parameters such that the 
resulting indefinite polynomial preconditioner speeds up the convergence of minimal residual 
method optimally is also addressed. For this task, we propose an approach based on the concept 
of asymptotic convergence factors. Finally, some numerical examples of indefinite polynomial 
preconditioners are given. 


Subject Classification: AMS(MOS): 65F10, 41A10, 65D15. 

Key words: Indefinite Hermitian matrices, minimal residual method, polynomial preconditioning, 
asymptotic convergence factor, Remez algorithm. 


flnstitut fur Angewandte, Mathematik und Statistik, Universitat Wurzburg, D-8700 Wurzburg, 
F.R.G., and the Research Institute for Advanced Computer Science, NASA Ames Research 
Center, Moffett Field, CA 94035. Work reported herein supported by Cooperative Agreement 
NCC2-387 between the National Aeronautics and Space Administration (NASA) and the 
Universities Space Research Association (USRA). 



1. Introduction 


Conjugate gradient type algorithms combined with preconditioning are among the most 
effective iterative procedures for solving large sparse nonsingular linear systems 

Ax = b. (1) 

In recent years, polynomial preconditioning has attracted much interest. The technique 
consists of selecting a polynomial s of small degree and then applying a conjugate gradient 
type method to one of the two linear systems 

s(A)Ax — s(A)b (2) 

(left preconditioning), or 

As(A)y = b, x = s(A)y (3) 

(right preconditioning). Remark, that (2) and (3) are both equivalent to the original 
linear system (1). Moreover, the systems (2) and (3) have the same coefficient matrix 
s(A)A = As{A). Clearly, the polynomial s should be chosen such that the conjugate 
gradient iteration for (2) resp. (3) converges as fast as possible. 

For the case of Hermitian positive definite A, the idea goes back to Rutishauser [24] 
who proposed polynomial preconditioning in the fifties as a remedy for roundoff in the 
classical conjugate gradient (CG hereafter) algorithm of Hestenes and Stiefel [17]. The 
recent revival [18] of Rutishauser’s method and the general interest in polynomial precon- 
ditioning is mainly motivated by the attractive features of this technique for vector and 
parallel computers (see [25] for a survey). 

In this note, we are concerned with polynomial preconditioning for linear systems (1) 
with Hermitian, but indefinite coefficient matrices A. An obvious strategy for the design 
of the preconditioner is to choose s such that s(A)A is as close as possible to the identity 
matrix J. This approach was studied in detail by Ashby [2] and Ashby, Manteuffel, and 
Saylor [3]. Note that the resulting preconditioned system (2) resp. (3) is then Hermitian 
positive definite and thus can be solved by the standard CG algorithm. 

The purpose of this paper is to document our study of a second preconditioning 
strategy which, in contrast to the first approach, leaves the preconditioned matrix s(A)A 
indefinite. Roughly speaking, s is chosen such that s(A)A is as close as possible to I 
on the positive part of the spectrum of A and as close as possible to /xJ, where /x € IR 
is some negative constant, on the negative part of the spectrum of A. In particular, we 
will show how polynomials s of this type can be obtained as solutions of a family of 
Chebyshev approximation problems depending on two paramaters, namely /x and a weight 
factor in € 1R. The question of how to choose these parameters in order to speed up the 
convergence of the iteration as much as possible will also be addressed. Finally, note that, 
since the resulting matrix s(A)A is now indefinite, the standard CG algorithm is no longer 
suitable for solving (2) resp. (3), and we use the minimal residual (MR hereafter) method 
instead. 

The paper is organized as follows. In Section 2, we recall some basic facts about 
the MR algorithm. In Section 3, matrices with spectrum symmetric to the origin are 
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considered, and it is shown that, in certain situations, the MR method is equivalent to 
the CG algorithm applied to the normal equations. In Section 4, we are concerned with 
the computation of the asymptotic convergence factor for the MR method based on the 
knowledge of two intervals which contain all eigenvalues of A. In Section 5, a two-parameter 
family of Chebyshev approximation problems is introduced, and some basic properties are 
listed. In Section 6, we consider indefinite polynomial preconditioners and show that there 
is an intimate connection with the class of approximation problems investigated in the 
previous section. A Remez type algorithm for the numerical solution of these problems is 
described in Section 7. Some numerical examples of indefinite polynomial preconditioners 
and their associated asymptotic convergence factors are presented in Section 8. Finally, 
we draw our conclusions in Section 9. 

Throughout this paper, A is assumed to be a nonsingula r Her mitian, but indefinite 
N x N matrix. <r(A) denotes the spectrum of A , and ||x|| 2 = Vx H x is the Euclidian norm 
of a; € C . Moreover, the notation II n will be used for the set of all complex polynomials 
of degree at most n. Finally, we denote by II the subclass which consists of all real 
polynomials in H n . 


2. The minimal residual algorithm. Error bounds 

Let xo € C N be any initial guess for the true solution A -1 6 of (1), and let ro — b — Ax o 
be the corresponding residual vector. Moreover, we denote by 

K n := span{r 0 , Ar 0 ,A 2 r 0 ,..., A" -1 !^} (4) 

the nth Krylov subspace of C N generated by t*o and A. Starting horn x 0 , the MR method 
generates a sequence of approximations x n , n = 1,2,..., to A~ l b which are uniquely 
defined by the minimal residual property 

||6-Ax n || 2 = min ||6-Ax|] 2 , x n e x 0 + K n . (5) 

i€io4“ An 

For Hermitian positive definite matrices A, the MR algorithm was introduced by Stiefel 
[26] as a variant of the classical CG method. However, the algorithm given in [26] may 
break down (see e.g. [5,10]) for indefinite Hermitian matrices. A stable implementation 
(algorithm MINRES in [22]) of the MR approach for indefinite Hermitian matrices was 
first devised by Paige and Saunders [22]. The main ingredient of MINRES is the celebrated 
Lanzcos algorithm [20]. 

Algorithm 1 (Lanczos). 

0) Set v 0 = 0, /3i = ||r 0 || 2 , v = r 0 . 

For n = 1,2,... 

1) If 0 n = 0, stop. 

Otherwise, compute 

2) v n = v/0 n , a n = v„ Av n , 
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t; = Av n - a„v„ - Pn+i = IMh • 


In the following proposition, some basic facts about the Lanczos algorithm and its con- 
nection with the MR method are listed. We refer the reader to [13, pp. 325] and [22] for 
proofs. The notations 


V n := (t>i,V 2 ,...,v n ) and 5„ 
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are used. Note that S n is an (n 4- 1) x n matrix. 

Proposition 1. 

a.) In exact arithmetic, Algorithm 1 stops for n = m + 1 where m := dim K^. 

b) The termination index m is equal to the minima/ number of components in any expan 

sion of r 0 into orthonormal eigenvectors of A, i.e. 


ro 


m 


= 51 p> u » 

j=i 


where pj > 0, Auj = A^Uj, A^ < A^ < • • • < A^ m ^, 


( 1 if j = k 
to ifj^k ' 


( 7 ) 


c) For n = 1, 2, . . . , m the nth iterate x n of the MR method is given by x n = x 0 4- V n y n 
where y„ is the solution of the linear system 


sHS„y = /3,S« 


/1\ 

0 

Vo } 


( 8 ) 


Moreover, in exact arithmetic x m = A 1 b. 

In MINRES, the MR iterates are computed via solving the linear system (8). This can 
be done very efficiently using a QR decomposition of S n . Furthermore, such a factorization 
of S n is readily obtained from the QR decomposition of S n -i in the previous step (see [22] 
for details). The resulting algorithm can be stated as follows. 

Algorithm 2 (MINRES implementation [22] of the MR method). 

0) Choose xq 6 and set v — b — Ax q, t>o = Po = P-i = 0, 

Pi=Vi = |M|„ c 0 = c_i = 1, s 0 = «_j = 0. 

For n = 1,2, . . . 
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1) If P n = 0, stop: x„_i solves Ax = b. 
Otherwise, compute 

2) v n = v/p n , a n = v%Av n , 

v = Av n - a n v n - P n v n - 1 , Pn+i = Nh, 

3) — &n—2Pni $n = *n-l®n 4* Cn— lCn — 2Pnj 
7n = C n _i Q n S n — \C n — 2Pn> 

"TV* = \J 7n 4" > C n — Tn/Tnj = Pn+l/lfnt 

4 ) Pn = (»n - $„Pn-l ~ C nP n -2)Hn , 

= ®n-l 4" T}nPn With T) n — C n f) n , 

*}n + 1 = &nfjn- 


Remark 1. The finite termination property of the Lanczos algorithm does no longer hold 
in the presence of roundoff error (see e.g. [13, pp. 332]), and the stopping criterion stated 
in Algorithm 2 is not useful in practice. Instead, one should terminate the iteration as 
soon as the norm ||r n || 2 of the residual vector r n = 6 — Ax n is sufficiently reduced. As 
Paige and Saunders [22] have pointed out, | |r„ [ [2 can be obtained without computing the 
vector r n itself by using the identity ||r n ||2 = P\&iS 2 ' ' ' *»»• 

Remark 2. Other numerically stable implementations of the MR method for Hermitian 
indefinite matrices were devised by Fletcher [9] and Chandra [5]. See also [10, 28] for 
further properties of the MR approach. Finally, we note that — as is typical for conjugate 
gradient type algorithms — there is an intimate connection between the MR method and 
orthogonal polynomials (see [ll]). 

For the choice of a suitable preconditioner for a conjugate gradient type algorithm, 
it is crucial to have error bounds for its iterates. Next, we state such estimates for the 
MR method. For this purpose, some information on the location of the eigenvalues of A is 
necessary. Here and in the sequel, we assume that two intervals [a, 6] and [c,<f] axe known 
such that 

<r(A) C [a, b] U [c, d] where c<d<0<a<b. (9) 

Note that, ideally, b resp. c would be the largest resp. smallest eigenvalue of A and a resp. 
d the smallest positive resp. largest negative eigenvalue. 

By the standard technique, expressing the Krylov subspace (4) K n = {g(A)r 0 | q G 
II n _i} in terms of polynomials and using the expansion (7) of t*o, one readily deduces from 
(5) the following result. 

Theorem 1. For n = 1, 2 . . . 


|| b— Ax„||2 


< E n (a,b,c,d) 


||6 — Asolh 

where E n (a, b , c, d) is the optimal value of the approximation problem 

E n (a,b,c,d):= min max |p(A)| . 

j>€n[,:p(0)=l *€[a,fc]U[c,d] 


( 10 ) 


(u) 


Note that the outlined derivation of the bound (10), actually leads to the complex 
version of (11) with II n instead of Iln^* Standard results (e.g. [21]) from approximation 
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theory guarantee that there always exists a unique optimal polynomial p* for this complex 
approximation problem- Moreover, it is easily verified (cf. [21, Theorem 27]) that p* is 
real, and therefore it is sufficient to consider only polynomials p € n„ in (11). 

Unfortunately, the solution of (11) is explicitly known only for special cases. For 
example, it is well known (see e.g. [2]) that for intervals of equal length b — a = d — c 
the optimal polynomials are suitably transformed Chebyshev polynomials. The solution 
of (11) is also known for a variety of other parameters a,b,c,d, and can be found in the 
classical work of Achieser [1] (see also Peherstorfer [23, Section 5]). For the general case, 
there is no closed expression for the optimal value E n (a, b, c, d) of (11). However, it is 
known that for n — ► oo this quantity behaves like K n where k = n(a,b,c,d) 6 (0,1). More 
precisely, it holds 

lim (E n (a,b, c, d)) 1/f " =: «(a,6,c,d) and 0 < n(a, b, c, d) < 1 (12) 

n— *oo 

(see Eiermann, Niethammer, and Varga [8], where this result is established for more general 
sets in the complex plane), k(o, 6, c, d) is usually called the asymptotic convergence factor. 
In Section 4, we will derive an explicit formula for k in terms of elliptic integrals. Based 
on this representation, k can be very easily computed numerically. 


3. A remark on matrices with symmetric spectrum 

The simplest way to obtain a conjugate gradient type method for linear systems Ax = b 
with arbitrary nonsingular coefficient matrix A, is to apply the standard CG algorithm 
to the Hermitian positive definte normal equations A H Ax = A H b. The drawback of this 
approach is that the condition number of A H A is the square of that of the original matrix 
A with the consequence that the resulting iteration will, in general, converge very slowly 
(see e.g. [27]). However, there are situations where working with the original system offers 
only little advantage over solving the normal equations or where the two approaches are 
even equivalent. Roughly speaking, this is the case if A has many eigenvalues in the right 
as well as in the left halfplane of C and/or if <?(A) exhibits certain symmetries. 

In this section, we are concerned with indefinite Hermitian matrices A with such a 
symmetrical spectrum. Since A = A H , the normal equations corresponding to the original 
system (1) assume the form 

A 2 x = Ah. (13) 

Next, we apply the standard CG algorithm [17] to (13) with x 0 € as initial guess. The 
resulting procedure — referred to as CGNE method in the sequel — generates a sequence 
of iterates x® GNE , Jt = 1,2,... which are characterized by the minimization property 

\\b-Axf JNE \\ 2 = inin ||6-A®|| 2 , x k € x 0 + K\. (14) 

x£xo+Kl 


Here 

K\ := span{Ar 0 , A 2 (Ar 0 ), A 4 (Ar 0 ),. . . , A 2 ^ _1 ^(Ar 0 )} 
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is the kth. Krylov subspace generated by Atq and A 2 . As in the previous section, we denote 
by x n , n — 1,2,..., the iterates produced by the MR algorithm applied to the original 
system Ax = b. It is assumed that the MR and CGNE methods are both started with the 
same initial guess xo- 

It turns out that the MR and CGNE approaches are equivalent whenever the eigen- 
values of A are symmetric to the origin, i.e. 

A € <?(A) implies — A € <?(A), (15) 

and the starting residual t*o has a “symmetric” expansion into eigenvectors of A. More 
precisely, we have the following 

Theorem 2. Let m, pj, X'*', j = l,...,m, be defined by the expansion (7) of tq and 
assume that l := m/2 E N and that 

A (i) = -A (m+1 “'\ pj — Pm+i-jt j — 1,2,... , Z, (16) 

holds. Then, for k = 1, 2, . . . , / 

*2 k = *2*+i = ^ < k° NE ■ 

Proof: Let u\ , . . . , u m be the eigenvectors of A which occur in the expansion (7) of ro. 
It is convenient to introduce the following notation. A vector v 6 is called even, resp. 
odd, if 


v = ^2 with ti = resp. ^ = -( m +i -j for i = 1,2,...,/. 

3-1 

Obviously, the following properties hold: 

(i) For any 7 € C, jv is even, resp. odd, if v is even, resp. odd. 

(ii) Av is even, resp. odd, if v is odd, resp. even. 

(iii) v h Av = 0 for any even or odd v. 

Next consider Algorithm 1 and let v n , n = 1, . . . ,m, be the Lanczos vectors. Clearly, r 0 
and therefore also Vj are even vectors. Using (i)-(iii), it follows by induction that v n is 
even, resp. odd, if its index n is odd, resp. even, and that 

a fl = 0 and /9 n+1 v n+ i = Av n - p„v n -i for n = 1,2, . . . , m. (17) 

The first identity in (17) and the definition of S n in (6) imply that the linear system (8) 
has the following structure: 
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Here, a “x” indicates a possible nonzero entry. By reordering the equations and the 
unknowns, a system of the type (18) can be transformed into one with the block structure 


(::)(?)-(!) 

where resp. j/ 2 ^ contains all components of y with odd resp. even index. Hence 
3/ 1 ) = 0 and, from part c) of Proposition 1, we deduce that 


*2*+i = Z2k G *o +span{r2,i;4,V6,...,t>2lfc}- (19) 

On the other hand, (17) implies that the subspace on the right-hand side of (19) is just 
the Krylov space K\ (note that Ar 0 = fi 2 f3i v 2 ). Thus x 2 Jfc € Xq + K\, and, in view of (5) 
and (14), it follows that x 2 k = x^ GNE . This concludes the proof of the theorem. ■ 


Remark 3. The eigenvalues of Hermitian matrices of the type 


A = 




where B is any px q matrix, 


always fulfill the symmetry condition (15) (see e.g. [13, pp. 285]). Moreover, it is easily 
verified that the remaining part of the assumption (16) is guaranteed if the starting residual 
is of the form 

r ° = (o)’ “ 6 CP ° r r ° = (v)’ V€C *’ 

Remark 4. In [10], it is shown that the assumption (16) implies a similar equivalence 
between CG applied to the “inner” normal equations A 2 y = 6, Ay = x and two other 
conjugate gradient type algorithms for Ax = fc, namely SYMMLQ [22] and Fridman’s 
method (see e.g. [28, 10]). 


4. Computation of the asymptotic convergence factor for two intervals 

In this section, we are concerned with the actual computation of the asymptotic con- 
vergence factor tc(a, b,c, d) defined in (12). As Eiermann, Li, and Varga [7] pointed out, 
asymptotic convergence factors — not only for the union of two real intervals, but for more 
general compact sets C C — can be expressed in terms of the Green’s function G(A; oo) 
(see e.g. [29, pp. 65]) for Cl c := C \ Cl with pole at infinity. Note that the existence of 
£?(A;oo) is guaranteed, if is of finite connectivity; moreover, the Green’s function is 
then uniquely defined by the following three properties: 

(i) G(-; oo) is a real harmonic function on C \ fi. 

(ii) There is a r 0 € H such that G(A; oo) - log |A| is harmonic for all A € C with |A| > r 0 . 

(iii) lim.x-,Ao G(A; oo) = 0 for all A 0 € dtl. 
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For ft = [a, 6] U [c, <f], the set Q c is doubly connected, and by applying the results from 
[7, Section 3] it follows that 

«(a, b,c, d ) = exp(— G(0; oo)). (20) 

Next, we use the connection (20) with the Green’s function to derive a representation of 
k (a, b, c, d) in terms of elliptic integrals. 

First, let C C be any doubly connected region with oo € fi c . Suppose we know a 
conformal mapping 

f : A r fi c , with f(A r ) — f2 c , (21) 

of some annulus 

A r := {z € C | r < |z| < 1}, with 0 < r < 1, (22) 

onto ft c . Moreover, it is assumed that 

r := / -1 ( oo) satisfies r < t < 1. (23) 

Note that (23) can always be achieved by a simple rotation in the z-plane. By means of /, 
the problem of finding the Green’s function for O c can be reduced to that of deter mini ng 
the Green’s function G r (z; r) for the annulus A r with pole at r. More precisely, the 
following identity 

G(A;co) = G r (/- 1 (A);r), A € ft c . (24) 

holds (e.g. [16, p. 259]). However, there are explicit representations for G r (z; r). Here we 
will use the following formula (see [16, pp. 259]): 


G r (z; 


r) = ( 


log I* 

logr 


l) logr — log 


E>i-o=, r>1 (-*/(”•))' 
E~-. ri\-rz/r)i 


(25) 


From now on, let fl := [a, b] U [c, <f]. So far, we have shown that, by means of (20), 
(24), and (25), the desired quantity k (a, b , c, d) can be expressed in terms of / -1 (0) where 
/ is a conformal mapping satisfying (21)-(23). Such functions f are explicitly known (see 
e.g. Kobers’s dictionary of conformal representations [19, pp. 191]) and are of the form 


/CO = fkA 0 := 


a — b 
2 


sn 2 (£- log z; 0 + sn 2 log t; k) q + t 

sn 2 (^-logz;fc) — sn 2 (^- log r; k) + 2 


(26) 


Here, w = sn (u;k) is the Jacobian elliptic function (see e.g. [14, pp. 904]) defined — via 
its inverse u = sn -1 (u>; k ) — by 


u = sn 1 (to; k) = f — . ^ ===== . (27) 

K J Jo y/(l ~ ^ 2 )(1 - k ^) V ^ 

The real number k is a parameter (the modulus of sn) with k £ [0, 1]. The number K 1 in 
(26) is not a free parameter, but depends on k : 

K' = I<\k) := d(f (= sn -1 (l; y/\ - ¥) . (28) 

yl - (1 — k 2 )sin 2 y 
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Similarly, we set 


K = K(k) := / . , (= sn-‘(l; *)) • (29) 

Jo v 1 “ * 2 sin 9? v y 

Note that sn(u; fc) is a doubly-periodic meromorphic function with periods AK and 2 iK' 
and poles at the points 2m K + (2 n + l)tiiT', m,n E Z. Finally, we remark that the branch 
of the logarithm in (26) is chosen such that 

log z — log \z\ + i arg z , —n < arg z < ir. 

Using standard techniques from complex analysis, it is readily verified that the func- 
tion (26) indeed maps an annulus A r of the type (22) conformally onto the complement of 
two disjoint real intervals. Here, the inner radius r of A r is given by 

r = r(*):=e*p(=j^). < 3 °> 

Moreover, the image of the outer boundary \z\ — 1 of A r under / is just the interval [a, 6]. 
Hence, it only remains to adjust the two free parameters k and r in (26) such that the 
inner boundary |z| = r of A r is mapped onto [c, <f]. This requirement leads to the two 
equations 


where we have set 


a — b 1+sn 2 (M;fc) a + b 

2 1 — sn 2 (M; k) 2 

a — b 1/k 2 + sn 2 (Af;fc) a + b 
2 1/k 2 — sn 2 (M; k) 2 


= c, 

(31a) 

= d. 

(316) 


(32) 


By solving first (31a) for sn 2 (M; k) and, subsequently, (31b) for k 2 , we obtain 


sn(M;l)=- v /^f and h = • (33) 

Note that, by (23) and (32), M < 0, and thus, in view of (27), also sn(M; k) < 0. By (33) 
and (32), the two free parameters k and r in (26) and the function / are now uniquely 
determined. By (20) and (24), we have 

«(a, 6, c, d) = exp(— G>(z 0 ; r )) where z o :=/ _1 (0). (34) 

Therefore, it remains to determine the solution zq of f(z) = 0. To this end, we set 

u 0 = — logz 0 or, equivalently, z 0 = ex p(~^f) * ( 35 ) 
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Using (26) and the first relation in (33), it follows that i*o Is the solution of 

sn(u 0 ;*) = \J ^ sn(M;k) = — . (36) 


Next, recall (e.g. [14, p. 914]) the identity 

sn(v + iK'; k) = - — 1 — — . 

k sn(v; k) 

By means of (37), (35), and (33), we deduce from (36) that 


(37) 


u 0 -v„ + „„ := - sn 1 k) , and *o = -«p(^). (38) 

Finally, using (34), (25) (with z = z 0 ), (30), (32), and (38), one arrives at the formula 

• (39) 

For the numerical evaluation of (39) it is advantageous to rewrite (39). To this end, let 

OO 

8(z, A) := ^2 exp(— ttA j 2 + 2izj) (40) 


( h ({ , v ° \ ^=-°° e3C p(-7T^xi 2 + fr(v 0 -M + K)2) 

K(a,6,c,d) = exp[(l + — ) — ) ) f 

oo exp ffjfrj 2 + ^r(vo + M + K)2j 


j=-oo 


be one of the theta functions (see e.g. [14, p. 921]). By means of (40), it follows from (39) 
that 


where 


( u j\ //, , v o\'xM\ 0(zi,A o ) 

*(<.,6,c,d) = «cp((l+ r ) Tr ) ^ 

= — i z\ = — ( v o —M + K), z 2 = — — {v Q + M + K). 


(41) 


K* 1 ~ x 2iK' K ' J "" ’ 2iK> 

A straightforward computation, using Jacobi’s identity (see e.g. [15, p.272]) 

*2 




with A = A 0 and z = Zj resp. z = z 2 , shows that the representation (41) is equivalent 
to the final formula (43) stated in the following theorem. Furthermore, by the variable 
transformation £ = w/y/T+1 in (27), we have expressed the elliptic integrals, which occur 
in (28), (29), (33), and (38), in terms of the standard form 


1 f dt 

R F (z y y,z):=- . ===== , *,y,z>0, 

2 Jo y/{i + z)(t + y)\t + z) 

of the ellpitic integral of the first kind. 


( 42 ) 
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Theorem 3. Let c<d<0<a<b. Then: 


K( “' b ' C’ d) = + W, S ):=l+2g(-l)V’cos(2«), (43) 


where 


9 = 6XP (-#-)’ fc = Y ( a-cKfe-rf) ’ * = 

= (44) 

_ [ ajb^d ) / d(a - jO c(a-6) \ 

V b(a-d) Rf V' b(a-d)' b(a-c))' 


The following corollary will follow readily from the representation (43) and (44) of the 
asymptotic convergence factor «. 


Corollary 1. 

a) /c(a, 6, c,d) is a continuous function on {(a, b, c, d) € 1R 4 |c<d<0<a<i}. 

b ) Let (a n } ne | V7 {6„} n€ jv, (c n ) n6 |V> and {d„} ng jy, be given convergent sequences with 
hmits a, b, c, and d, respectively. Moreover, assume that c n < d n < 0 < a n < b n for all 
n € IV and that c<d<0<a<6. If a = 0 and/or d = 0, then 


lim K(a n , 6„, c„, d n ) — 1. 

n— »oo 


Proof: First, note that all the operations in (43) and (44) are continuous as long as 
c<d<0<a<b holds, and part a) is obviously true. We now turn to the proof of 
part b). Let k „ , q^ n \ k^ n \..., denote the quantities in (43) and (44) evaluated at 
a n> b n , c n , d n . We need to check their behavior for n — ♦ oo. There are three cases, 
namely 

(i) d = 0 < a, 

(ii) d < 0 = a, 

(iii) d = 0 = a. 

In the cases (i) and (ii), the sequences q^ n \ k^ n \ K^ n \ . . . , converge for n — ► oo to 
finite limits q, k, K, , v 0 , respectively, and K > 0. Furthermore, v 0 = —K in case (i) 
and t>o = 0 in case (ii). Therefore, in view of (43), /c n converges to 1. Finally, consider the 
case (iii). Here k n converges to k = 0. By (29), (28), and (44), it follows that 


lim K n = 0, lim K’ = oo, and lim q n = 0. 

n— ► oo n— ► oo n— ♦ oo 


Using the definition of the theta function in (43), we deduce 


lim t? 4 (V’„,q n ) = 1 for all € 1R, 

n— ►oo 

and hence, by (43), lim,,-*,*, K n = 1. This concludes the proof of the corollary. ■ 
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Remark 5. The theta functions t9 4 and $ in (43) and (40), respectively, are connected 
through 

t? 4(V’> q) = 0(V> + ^r/2, -(log q)/i r), € R, 0 < 5 < 1. 

There are whole books filled with the numerous properties and idenities which hold 
for theta functions. In the sequel, we will make use of the relations 

i9 4 (7r/2,q) = y/2K/w (45) 


and 

OO 

q) = JJ (1 — 2q 2 ’~ 1 cos(2 V>) + q 2(2j-1) ) (l - q 2 ->) 

> =1 (46) 

> MO, q) — for all € R 

(see e.g. [14, pp. 921]). Here, q is defined in (44) with K = K(k ) and K' = K'(k) given 
by (29) and (28). 

By means of Theorem 3, the asymptotic convergence factor «(a, b, c, d ) can be very 
easily computed numerically. For the calculation of the integrals Rp of the type (42), 
which occur in (44), there are standard algorithms. For the numerical examples presented 
in Section 8, we have used a procedure due to Carlson [4, Algorit hm 1]. Finally, in (43), 
an infinite series needs to be computed twice. In the following, let ^ € R and J € IN. 
Moreover, suppressing the parameter q and the index 4, we set 

J 

*9(V0 := (V’i?) and i?^(V>) := 1 + +2^ (— 1 )*q J ' 3 cos(2 ij>j) . (47) 

3=1 

If J is chosen large enough, the finite series 1 will yield a sufficiently accurate ap- 
proximation to 1 We now derive a formula for such an integer J. Using (43), (47), 
(45), and the fact that 0 < q < 1, one obtains 


l<W - = 2 


52 (-l)V’ cos(2iAj) 


j=J+l 

OO 


<2 y, /= 2 S < -' +1) '52/ +2 « j+1) <2 S < j+1 )’52/ 

j— J+ i i=o j = 0 

= ? < ' ,+,) ’(1 + i94(x/ 2. 5 )) = g< J+1 >'(l + ^2Kp) . 


(48) 


With (47), (46), and (48), we arrive at the estimate 


MP) - 

d(xl>) 


< q 


(J+ 1, 1 + Mm) 

(1 - e) 11 * 


1/2 


(49) 
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From (49), it follows that the truncated series t approximates t?(V») with a relative 
error 




< 


if J is chosen as 


J := [t] where s := 


log(e(l - fc 2 ) 1/4 ) 4- log(l + (7r/(2JtT)) 1/2 ) 
log? 


,1/2 


Here, as usual, [t] denotes the integer part of t € IR. 

We conclude this section by stating the following proposition which follows as a special 
case of a more general result due to Eiennann, Li, and Varga [7, Proposition 3], This 
monotonicity property of the asymptotic convergence factor k will be used in Section 6. 


Proposition 2. Let c < c' < d! < d < 0 < a < a' < b' < 6 be given and assume that at 
least one of the inequalities “<” is strict. Then: 


K(a' ,b' ,c' ,<?) < K(a,b,c,d). 


5. A family of Chebyshev approximation problems 


As we will see in the next section, the task of finding an optimal polynomial preconditioner 
for Ax = b leads to a family of Chebyshev approximation problems. In this section, some 
results for such approximation problems are presented. 

In the following, it is assumed that S := [a, 6] U [c, d\ is the union of a positive and 
negative interval with arbitrary, but fixed endpoints c<d<0<a<b. Moreover, / € IN 
always denotes a positive integer. Finally, set 

T := {(/i,u>) € B. x IR | w > 0). 


We will study the following family of approximation problems depending on the two pa- 
rameters € T: 


:= 


min ||/ — As||u, , ||/ — As||u, := max |u>(A)(/(A) — As(A)) | , (50) 

A€S 


where 


w(A) 


ri if A > 0 
l w if A < 0 



if A > 0 
if A < 0 


(51) 


Remark 6. For the special case fi = w = 1, (50) reduces to the approximation problem 
(11) (with n replaced by /) which arose in Section 2 in connection -with error bounds for 
the MR method. 

(50) is a linear Chebyshev approximation problem: We seek to approximate /(A) by 
polynomials of the form As(A) 6 11^ in the weighted uniform norm || • || w . Note that 0 & S, 
and this guarantees that Haar’s condition is satisfied. Standard results (see e.g. [21]) from 
approximation theory show that there always exists a unique optimal polynomial for (50) 
which is characterized by an equioscillation property. We summarize these results for (50) 
in the following 
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Proposition 3. Let l £ IN and ( p , w) € T. Then: 

a) There exists a unique optimal polynomial s*(A; p, w) € Hjl\ for (50). 

b) s e njl\ is the optimal polynomial for (50) if, and only if, there exist l+l extremal 
points 


c<Xo<Xi <---<\ knt ,-i<d, a<X knei <A fcne , +1 <--<A ,<6 (52) 

of w(A) — Aa(A) and a number y € 1R such that 




(-1 ) J y for j = 0, 1, , k neg - 1 
(-1 ) 3 ~ l y for j = k neg ,k neg + 1,...,/ 


(53) 


Moreover, if s is optimal, then ji(p, w) = |y|. 

Here, a point A* € 5 is called an extremal point of / — As(A) if 

K**)(/(A*) - A*«(A*))| = ||/ - Aj||„ . 

The following corollary is a simple consequence of part b) of Proposition 3 . 

Corollary 2 . Let s*( A; p,w) be the optimal polynomial of (50). 

a) s* = 0 if, and only if, 1 = 1 and w = 1/p. 

b) Unless sf ^ 0, there are at least 1 + 1 and at most 1 + 3 extremal points of f — X s* in S. 
Moreover, at most 1—1 of these extremal points are contained in the interior (a, b ) U (c, d) 
of S. 

Proof: By using (52) and (53), one readily verifies part a). 

We now turn to part b). First, note that, by part b) of Proposition 3 , / — Xsj has at least 
1 + 1 extremal points in S. Next, recall (cf. (51)) that / is constant for A > 0 and A < 0 , 
respectively. Hence 


(/(A) - Xs*(X; p, w))' = - (As*(A; p, w))' =: p( A) for all A ^ 0 . ( 54 ) 

Now assume that s* 0 . Then p is a polynomial of degree not exceeding / — 1 and 
p ^ 0 . This shows that p has at most / — 1 zeros. On the other hand, in view of ( 54 ), 
p(Xj) — 0 for all extremal points A j € S \ {a, 6 , c, d] of / — As*, and thus there are at most 
/ — 1 such “inner” extremal points A j. Therefore, altogether, there can not be more than 
/ — l + 4 = / + 3 extremal points in S. I 


In the next section, we will also make use of the fact that the optimal value of (50) 
depends continuously on the parameters p and w. 

Lemma. Let 1 € IN. Then, the optimal value ji{p,w) of (50) is a continuous /unction of 

{p,w) € r. 

We remark that, for w fixed, it follows from a standard result (see e.g. [30, Lemma 13.1]) 
in Chebyshev approximation theory that 7 i(p,w) is a continuous function of p. The proof 
given in [30] is easily adapted to the family of approximation problems (50). 
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Proof of the Lemma: Let /ii» /12 G IR, w\ , W2 > 0 be arbitrary, and denote by 
fi, o>2 the associated functions (51). Furthermore, assume that / € IN is fixed, and 

let sj and be the optimal polynomials of (50) corresponding to and (fi2,W2), 

respectively. By using the optimality of , the triangle inequality, and the obvious fact 
that || ■ | | Wl < max{l,twi/u2}|| • ||u> 2 , we obtain the estimates 

7(0*1, «>i) = | |/i - AsJ|| Wl < | |/i - As£|| Wl 

< I l/i - /2IU1 + II/2 - AiJIlwi (55) 

< ^1 |/*i — M2I + max{l, W 1 /W 2 } 7 /(a* 2,W2). 

With 7 i(a* 2, ^2) < H/alU = max{l, |/x 2 |tr>2 } » it follows from (55) that 

7((Mi, u, i) — 7((/*2, ^2) < w i |/*i - A*2 1 + max{0,(tO! — w 2 )/w 2 } max{l, I/12IW3}. (56) 

Obviously, we may exchange the parameters (mi,wi) and (fa, ^2) in (56). Therefore, (56) 
leads to the inequality 

|7((/*l,Wi) - 7((/*2,W2)| 

< Imi _ M2 1 max{ui,u;2} + |u>i — W2I max{l/u>i, l/to 2 } max{l, |mi|^i, |M2|^2} 
which implies the continuity of 7i(/z, w). B 

Remark 7. In general, is not differentiable. Typically, differentiability gets lost 

when the number k ntg = k neg (fi,w) of negative extremal points in (52) and (53) changes. 
The following example illustrates this behavior. Let / = 2, S = [1,3] U [-2,-1], and 
M = —2 be fixed. It is straightforward to verify, by means of part b) of Proposition 3, that 
the best polynomial a*(A; w) and corresponding optimal value t(u>) of (50) are given by 

«*(A; w) = - 4 ? 7(u>) = i, if 0 < w < 0.1, 


and by 




2(2 - A/Q 

( + 2-1/C 


7(«>) 


( + 2-1/C 


with £ = 1 + 2 




3 w 


l + 2u>’ 


if 0.1 < iv < wq- 


Here u>o is the unique root of 4w 2 — 188u> + 49 = 0 in the interval (0, 1). Moreover, 
the extremal points are 1,2,3, if 0 < tv < 0.1, —2, 1,2, 3, if w = 0.1, and — 2, 1,£, if 
0.1 < w < wo. Obviously, 'y(w) is a differentiable function of w for 0 < u; < wo, w 0.1, 
but, since 

lim 7 , (u) = 0 and lim 7 f (u>) = 100/147, 
u,— 0.1-0 v 7 u,— 0.1+0 v 7 ' 

7(10) is not differentiable for w = 0.1. 
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6. Indefinite polynomial preconditioners 

In this section, we return to the polynomial preconditioned MR method for solving the 
linear system (1), Ax = b. In particular, the question of how to choose an appropriate 
polynomial s for the preconditioned systems (2) resp. (3) is addressed. 

As in Section 2, it is always assumed that A is a given indefinite Hermitian N x N 
matrix and that a, 6, c, d € 1R are known such that 

<r(A) C S := [a, 6] U [c, d] where c < d <0 < a <6 (57) 

(cf. (9)). In this paper, we will not consider the problem of how to actually obtain such 
bounds. The reader is referred to [3,11] where some results regarding this question can be 
found. 

First, we note that the coefficient matrix As(A) of the preconditioned systems (2) or 
(3) is Hermitian if, and only if, s is a real polynomial. Therefore, in the following, it is 
always assumed that s £ II where I 6 IN is an arbitrary, but fixed integer. Furthermore, 
in order to guarantee that As(A) is nonsingular, we require that s(X) ^ 0 for all X € S. 
Since s is continuous and in view of (57), this condition implies that there are essentially 
two different cases: Either 


As(A) >0 for all A € S, (58) 

or 

As(A) >0 for all A € [a, 6], and As(A) <0 for all A e [c, d]. (59) 

Clearly, also the two cases with reversed inequalities may occur, but these can always be 
reduced to (58) resp. (59) by replacing s by — a. 

If (58) is satisfied, then, by (57), the preconditioned matrix As(A) is positive de fini te. 
For the case (58), the standard strategy for the choice of the polynomial s is to require 
that As (A) approximates the constant function 1 as close as possible on 5. Here, closeness 
is measured in the uniform norm on 5, i.e. s is given as the optimal solution of the 
approximation problem (50), (51) with p = w = 1. This case was studied in detail by 
Ashby [2] and Ashby, Manteuffel, and Saylor [3]. 

If (59) holds, then, in view of (57), the preconditioned system re mains indefinite, and 
we will use the 

Definition 1. A polynomial s € H/I.\ is called an indefinite polynomial preconditioner 
for Ax = b if (59) is satisfied. 

In the following, we will investigate indefinite polynomial preconditioners and, in particu- 
lar, develop a strategy for an optimal choice of s. 

From now on, it is always assumed that s satisfies (59). The criterion for selecting 
the preconditioner which we will propose here is based only on properties of the coefficient 
matrix As(A), and hence is the same for left and right polynomial preconditioning (2) and 
(3). For simplicity, we will consider only the approach (3) in the sequel. More precisely, 


17 



let *o € C N be any initial guess for the solution of Ax = 6, and let y n , n — 1,2,..., be tbe 
sequence of iterates generated by the MR method (Algorithm 2) applied to 

As{A)y = b — Ax o (=: ro), with starting vector yo := 0. (60) 

The iterates and residual vectors corresponding to the original system Ax = b are then 
given by 

X n = *0 + *(A)y n and r n =b~ Ax n = r 0 - Aa(A)y n , (61) 

respectively. Notice that only the iterates y„ axe updated at each step. The corresponding 
approximate solution x n of Ax = b needs to be computed only once, namely in the very 
last step of the algorithm. Furthermore, we remark that, in view of (61), the residual 
vectors of y n ( with respect to (60)) and of x n (with respect to Ax = b) are identical. This 
is a slight advantage of right polynomial preconditioning over the approach (2). 

Next, using the results from Section 2, we state error bounds for the preconditioned 
MR method. Setting 


5 := min As(A), E := max As(A), e := min Aa(A), 3 := max As (A), (62) 

*e[a,6] Ae[o,6] A€[c,<£] A€[c ,d] V J V ’ 

it follows from (57) and (59) that 

a{As{A)) C S := [a,E] U [c, <J] and c< 3 < 0<5<E. (63) 

Obviously, the numbers defined in (62) depend on s, and we will indicate this, if necessary, 
by writing 5 (a), E(s), c(a), 3(a). Then, in view of (63), Theorem 1 yields the estimates 


||E Ax w | [2 p /. T , T\ __io 

||6-Ax 0 || 2 - En(a,b,t,3), n-1,2,.... 

Furthermore, by (12), the error bound in (64) behaves like 

E n (a,b, c,3) « (/c(a,E, c, <?))", for n large. 


(64) 


(65) 


Therefore, (64) and (65) suggest the following notion of an optimal indefinite polynomial 
preconditioner. 

Definition 2. An indefinite polynomial preconditioner a * £ II ^ is called opt imal if 

n(a*)< k (s) (66) 

for all indefinite polynomial preconditioners a £ II ^ . Here, and in tbe sequel, 


k(s) := K(a(a),l(a),c(a),3(a)). 

Finally, we get to the promised connection between indefinite polynomial precondition- 
ers and the family of approximation problems (50). Let (p, w) £ T and s*(A) := a*( A; p, w) 
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the corresponding optimal polynomial for (50). First, we characterize those cases where s* 
yields an indefinite preconditioner. With (50) and (51), it follows that the numbers (62) 
associated with s* are 


a(s*) = 1 - 7 i(n, w), l(s*) = 1 + 7/(/i, w), 

*.*)-#• J(s‘) = M + 7i( "’ w) 


(67) 


w 


w 


In view of (59), (62), and (67), s * is an indefinite polynomial preconditioner if, and only 
if, € T/. Here, we have set 


F/ := {(/*, w) € 1R x 1R | w > 0, 7 i(p, w ) < 1, and p < —7 j(p, io)/io}. (68) 

Moreover, by (67), if a* is an indefinite polynomial preconditioner, then 

*(•*) = 9i{v,w) := k( 1 - 7/(/i, w), 1 + 7/(/x, ia), /i — w ) ). (69) 

u> w 

Notice that <7/(/i, u>) is a well-defined function for (/*, to) 6 IV 

After all these preli minari es, we can now state the main result of this section in the 
following form. 

Theorem 4. Let l e IN. 

a,) Let s 6 be an indefinite polynomial preconditioner, 5, 5, g, <J the corresponding 
numbers defined in (62), and set 


A»i 


J+ g 
6 + 3 


and u>i 


5—3 
J^g ' 


(70) 


Then, the optimal polynomial s*(A) := a*(A; /*i,u>i) 0/ (50) with parameters p\ and 117 is 
an indefinite polynomial preconditioner and, unless s* = s, 


k(s*) < k(s). 

b) There exist parameters p 0 and w 0 such that 


(71) 


9i(vo,u>o) = , min gi(p,w), (p 0 ,w 0 ) € T/. (72) 

c) Let po and ti>o satisfy (72). Then, the optimal polynomial s*(X; p Q ,wo) of the ap- 
proximation problem (50) with parameters po and wo is an optimal indefinite polynomial 
preconditioner. 

Proof: First, we prove part a). Let s € be an indefinite polynomial preconditioner, 
and hence, by (63), g<J<0<3<5. Moreover, by replacing s by (2/(a + 6 ))s, we may 
assume without loss of generality that 


a + 5 = 2. 


(73) 
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Note that this does not change the asymptotic convergence factor k(s) associated with 
*. Indeed, it is easily verified that «(s) = tcfata) for all a € 1R \ {0} and all indefinite 
polynomial preconditioners s. Now, by using (62), (70), and (73), we obtain 


max |1 - MA)| = .max |ti>i (m - Xa(X)) | = 

Afc[c,aJ 



In view of (50) and (51), we conclude from (74) that 


7 := 7i(/ii,n>i) < - - - with “ = ” holding iff s = s*. 
£ 

With (67), (70), and (73), it follows from (75) that 

c(a) < c(s*) < J(a*) < J(a) < 0 < a(s) < a(s*) < l(s*) < l(s) 


(74) 


(75) 


(76) 


where, unless s = a*, at least one of the inequalities “<” is strict. In particular, (76) 
shows that a* is an indefinite polynomial preconditioner. Moreover, by Proposition 2, (76) 
implies (71). 

We now turn to the proof of part b). In view of (69), (68), part a) of Corollary 1 (see 
Section 4), and the Lemma proved in Section 5, the function tv) is continuous on Tj. 
Furthermore, by (12), (68), and (69), 


gt(n,w) < 1 for all (77) 

Next, remark that, by (68), the boundary dTi of T; is given by 

dTi = {(/x, u;)elRx]R|u;> 0, 1 — 7 /(/x, tv) — 0, and/or /i + 7 /(/i, tv)/tv = 0}. (78) 

By means of (69), (78), and part b) of Corollary, we conclude that 

, s Km gi(^tv) = 1 for all (/i,u>) G 8T t . (79) 

(h,w)€Ti 


From (77), (79), and the continuity of gi, it follows that gi attains its minimum on Tj, i.e. 
(72) holds true. 

Finally, in view of (69), (66), and (71), the statement of part c) is an immediate consequence 
of part a) of this theorem. I 


By means of part c) of Theorem 4, an optimal indefinite polynomial preconditioner 
can be constructed via the numerical solution of approximation problems of the form (50). 
In the next section, we sketch an algorithm for this task. 
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7. A Remez type algorithm 


The standard tool for the numerical solution of general real linear Chebyshev approxima- 
tion problems is the method of Remez (see e.g. [21, pp. 105] or [30, pp. 163])- For the case 
H = w = 1, de Boor and Rice [6] devised a Remez type procedure for the approximation 
problem (50) which exploits the special structure of (50). In this section, we outline an 
extension of their algorithm for the general family (50). 

Let / € IN, p G 1R, and w > 0 be given. We Me concerned with the approximation 
problem (50) where the functions / and u Me defined in (51) and 5 = [a, 6] U [c, d] is the 
union of two intervals with endpoints c<d<0<a<b. In the following, let a € II 
be any candidate for the optimal polynomial a * of (50). It will be convenient to introduce 
the so-called residual polynomial 


p(X) = p(\)s) := 1 - As(A) 


(80) 


corresponding to a. Note that, by (51) and (80), 


«(A)(/( A)-A,(A)) 


( p(A) if A > 0 

| w(fi — l+p(A)) if A < 0 


(81) 


The Remez type procedure for the numerical solution of (50) is based on the equioscil- 
lation property stated in part b) of Proposition 3: We seek a polynomial a € 
with / + 1 extremal points (52) such that (53) holds for some number y 6 ]R. For any 
k neg = 0, 1, . . . , 1, denote by 


:= (A = (A 0 , Ai, . . . , A/) | c < A 0 < • • • < Xk n . t -i <d, a < A < • • • < A i <b} 

the set of all possible Xj for which (52) holds. By means of A* B€# , we can parametrize all 
the polynomials a which fulfill (53). 

Proposition 4. To each A = (Ao, Aj , . . . , A/) € A * B## , k ntg = 0, 1 there is a unique 

polynomial s(-; A) € II|l\ and a unique number y( A) € 2R such that (53) holds true. 
Moreover, s(-; A) is defined via 


1 - As(A; A) 


kn eg 1 l 

£ a - m+ -(A) + y , (-ly-'w^A), 


j=0 


j — &nef 


/ 


H A) n 

»=0 


A- Ai 
Xj — A, ’ 


(82) 


and y = y(A) is given by 

_ = l+tM-iis^yU/o) 


(83) 
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Proof: By means of (81), we rewrite (53) in the form 


ri-„ + (-iyy/«> f° r 3 — 0,l,...,fc»e* — 1 
\ (— l) J-1 y for j = k neg ,k neg + 


( 84 ) 


where p is the residual polynomial (80) corresponding to «(•; A). Here y € H is still a free 
parameter. By the Lagrange interpolation formula, for any fixed y 6 H, there is a unique 
polynomial p € Hj which satisfies (84), and p is given by (82). It remains to determine 

y. For this purpose, we remark that p is the residual polynomial (80) of «(•; A) € H|_j 
iff p(0) = 1. Using the Lagrange representation (82) of p, it is readily verified that this 
condition is equivalent to the formula for y in (83). Finally, we notice that — as is easily 
checked by means of (52) and the definition of Lj in (82) — all the terms in the sums 
of the numerator of the right-hand side of (83) have the same sign, namely (— 1)*»«* -1 . 
Hence this numerator is never zero and y is well defined by (83). ■ 


In the sequel, we will use the notation 

<(AiA)*uKA)(/(A)-A,(A;A)) (85) 

for the error function corresponding to «(•; A). Now, in view of Proposition 4 and part b) of 
Proposition 3, the approximation problem (50) is reduced to the task of finding the unique 
vector A* € A**^ , with k* tg 6 {0,1,...,/}, whose components A* are indeed extremal 
points of /(A) — As(A; A*). By (53) and (85), this last requirement is equivalent to 

( |c(A*;A*)| = ) |y(A*)| = max|e(A; A*)|. (86) 

Furthermore, note that for all A = (Ao, . . . , A;) € A * n4# and k neg € {0, 1, . . . , /} 

(|e(A,;A)|=) |jf(A)| < |y(A*)| = < max|e(A; A)|, if A # A*. (87) 

The optimal vector A* which satisfies (86) can be computed by a Remez type iteration. 
We now describe a typical step of such a procedure. After, say n, iterations, the algorithm 
has generated an approximation A := A* n) € A where k ntg := k$ g , to A*. Unless 
A = A*, one constructs the next iterate A := A^ n+1 ^ as follows. In view of (86) and (87), 
the choice of the elements A j of A aims at fulfilling 

|e(Aj-; A)| « max|c(A;A)|, j = 0,1, . . . ,/, 

as good as possible. In order to achieve this, one first computes A^ which correspond to 
some approxi ma te local maxima of |e( A; A) | near A j under the additional constraint that 

A' := (Aq, Aj , . . . , A/) € At n<# . 
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Next, it is checked whether any of the endpoints A e € {a, b, c, d} of S satisfies |e(A e ; A)| > 
|y(A)| and can be exchanged with one of the elements, say X'j o of A' such that 

A := A' U {A,} \ {A'.> € A U , 

This procedure guarantees that A approximates (86) better than A in the sense (cf. (87)) 
that 

|y(A)| < |y(A)| < |y(A*)|. 

The iterates A^"' of such a Remez procedure can be expected to converge quadratically 
to A*. Here, we will not give a convergence proof and, instead, refer to the approximation 
theory literature (see [21, 30] and the references therein) for a more general study of Remez 
type methods. 

The outlined Remez procedure for the approximation problem (50) can be summarized 
as follows. 

Algorithm 3 (Sketch of a Remez procedure for (50)). 

0) Choose k neg € {0, 1, . . . , /} (e.g. k ntg = [/(d - c)/(b -a + d- c)]) 
and A = (A 0 ,...,A f ) € A* ne ,. 

Repeat steps 1) through 4) until convergence: 

1) Using (83) and (82), compute y := y(A) and some representation (e.g. Newton 
interpolation) of the residual polynomial p corresponding to s(-; A). 

2) For j = 0,1, ...,/: 

Set 

c j(A) :=(signe(Ajf; A))«(A; A) (88) 

and compute A '• such that: 

(i) Cj(A' ) approximates some local maximum of Cj( A) near A j, 

(ii) A' := (Aq, Aj, . . . , A{) £ A* ne# . 

3) Compute A := (A<>, Ai, . . . , A/) as follows: 

(i) Compute T) 0 = c(c; A) and rji = e(A 0 ; A). Ifk neg = 0, set rji — —tji. 

If Vom < -y 2 : 

Set A 0 = c, Xj = AJ_, for j = 1, . . . , /, k ntg = k ntg + 1, 
and go to 4 ). 

(ii) Compute tj 0 = e(d; A) and m = e(X tnmf ;A). Ifk neg = 0, set m = ~Vi- 
II VoWi < -y 2 : 

Set Xj = Xj for j = 0, . . . , k neg - 1, A fc „. # = d , 

kntg = k n tg 4r Aj • = A j for j — k n tgi • • • * 
and go to 4). 

(iii) Compute rjo = e(a;A), and, 

if kntg < l, m = c(Ajk„„;A), resp., ifk ne g = l + l,m = -«(Aj; A). 

If mm < -y 2 : 

Set Xj — Xj for j — 0, . . . , k neg — 2, k ntg — k neg 1, 

= = A^ for j — - k n eg "t" 

and go to 4). 
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( iv ) Compute rjo = c(6; A), and 171 = e(A/; A). Ifk neg = 1 + 1, set 171 = —171. 

if mm < - y 2 - 

Set A j = Aj +1 for j = 0, — 1, 

A l — b, k neg = kneg 1 , 
and go to 4). 

(v) Set A - = A , k neg = k neg . 

4) Set A := A, fc ney := max{min{£ n ef* /}, 0} , and go to 2). 


Remark 8. Practical procedures for computing the approximate local max i ma in step 2) 
of Algorithm 3 can be found in [12]. For the numerical examples presented in the next 
section, we have used quadratic interpolation. E.g. for an interior point A j € (a, b ) U (c,d) 
one proceeds as follows. Let &> •'= A j and ( := A J+1 (resp. Xj-i) if e'(£ 0 ) > 0 (resp. < 0). 
Then, set & = £ except for the case that j = l (resp. j = 0) or for the case that £ and £i 
are not contained in the same interval of S = [a, 6] U [c, d]. In both cases, we choose as 
the endpoint of S which lies between £o and £. Next, the function tj is interpolated by the 
quadratic q defined by q(( 0 ) = e_,-(£ 0 ), q(Ci) = and g'(^ 0 ) = If q attains its 

maximum, say at ^*, we repeat the whole process a second time with := £*. The new 
resulting ^*, if it exists, is our canditate for A^-. If one of the two quadratic interpolations 
fails or if the resulting A' would not satisfy (52), a crude local search for the ma xi mum 
near Xj is applied, based on simply evaluating e>-( A) for several A « A j and determining 
the largest value. 

Remark 9. In view of (81), the error function (85), e, and the residual polynomial p 
defined in (82) are connected through 


<A;A) 


j p(X) if A > 0 

| w(p - l+p(A)) if A<0 


(89) 


In particular, the function (88), ej, and its derivative can easily be computed via (89) and 
some representation of the polynomial p. 


8. Numerical examples 

Based on the connection with the family of approximation problems (50), we have com- 
puted indefinite polynomial preconditioners in a number of cases. For the solution of (50), 
the Remez procedure described in the previous section was used. Optimal indefinite poly- 
nomial preconditioners were computed by solving the unconstrained optimization problem 
(72) numerically. Recall (cf. Remark 7 in Section 5) that the function gi in (72) is continu- 
ous, but only piecewise differentiable. The numerical evaluation of asymptotic convergence 
factors was done as outlined in Section 4. 

In the sequel, we present the results of a typical example. The set S is given by 

S := [a, 6] U [c, d] with a = 0.01, b — 0.99, c = —0.59, d = —0.1. (90) 
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The asymptotic convergence factor, which corresponds to no preconditioning, is 


n(a, b, c, d ) = 0.9590 . . . . 


For l = 2, 3, ... , 10, we have computed indefinite polynomial preconditioners via solving 
(50) with the following parameters. The first choice 

H- 1 = —1 and iv-i = 1 (91) 


aims at clustering the positive and negative eigenvalues of As(A) uniformly around 1 and 
— 1, respectively. The resulting asymptotic convergence factor is denoted by /c_j in the 
table below. A second obvious strategy is to choose the parameters in (50) such that 
the two intervals (63), S := [a, 5] U [c, J) containing the eigenvalues of the preconditioned 
coefficient matrix As(A) have the same relative length and position as the original intervals 
[a, 6] U [c, d], i.e. 

6 + a 5+a b — a J — a 

— = -= and — = -s . 

d+c d+Z d — c d—Z 

It is readily verified that (92), is fulfilled for the parameters 



d+c , b—a 

Hi = j— — and u>! = 

b + a d — c 

(cf. part a) of Theorem 4). The resulting asymptotic convergence factor for this choice 
will be denoted by k\. Note that for the example (90) considered here 


Hi = —0.69 and Wi = 2. (93) 

Finally, via part c) of Theorem 4, we have also computed the optimal asymptotic conver- 
gence factor K opt and the corresponding parameters Hopt, Wopt of (50). The following table 
lists the results for the three different strategies. 


Table 1 


These results are quite typical for the numerical experiments which we have performed. 
In particular, they show that the simple strategy (91) leads to very poor convergence rates, 
in particular as l increases. The second strategy leads to better results, but is still by far 
inferior to the best possible choice. Also notice that the optimal parameters Hopt and w opt 
exhibit a rather erratic behavior as l increases. 

The following two plots show, for two cases, the polynomials As* 0 (A) corresponding 
to the indefinite preconditioned coefficient matrix As(A). Here 8i 0 (X) denotes the optimal 
polynomial of (50) with l = 10. For Figure 1, the parameters (93), Hi = —0.69 and wi = 2, 
were used. Figure 2 corresponds again to h — /*i, but with increased weight w = 10. 


Figure 1 
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Figure 2 


Finally, the last two plots show the surface of the function l/gi(fi,w) (cf. (69)) whose 
maximum, in view of part c) of Theorem 4, corresponds to an optimal indefinite polynomial 
preconditioner. For the plots we have set gi(fi,w) = 1 if (/x, w) g T/ (cf. (68)). In both 
cases, the left comer is the point (g., w) = (0,0). The axis pointing towards the reader is 
the /z-axis. Figure 3 displays the results for 7 = 3 and Figure 4 for 7 = 10. 


Figure 3 


Figure 4 


9. Conclusions 

We have investigated polynomial preconditioners for Hermitian indefinite linear systems 
which lead to indefinite preconditioned coefficient matrices. In particular, it was shown that 
such polynomials can be obtained via the solution of a two-parameter family of Chebyshev 
approximation problems. Based on the concept of asymptotic convergence factors, we 
have introduced the notion of an optimal indefinite polynomial preconditioner. In order 
to actually compute such an optimal preconditioner, one needs to minimize a continuous, 
but only piecewise differentiable function of two variables. Moreover, each evaluation of 
this function requires the solution of an approximation problem of the type (50). A Remez 
type procedure for the numerical solution of (50) was outlined. A few numerical examples 
of indefinite polynomial preconditioners were presented. In a forthcoming technical report, 
we will report on numerical tests for the minimal residual algorithm combined with the 
indefinite preconditioners developed in this paper and compare this approach with other 
preconditioning strategies for indefinite Hermitian matrices. 
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1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

K-l 

0.986 

0.974 

0.962 

0.957 

0.937 

0.937 

0.922 

0.915 

0.906 

« 1 

0.959 

0.936 

0.932 

0.918 

0.908 

0.902 

0.886 

0.885 

0.869 

& opt 

0.948 

0.905 

0.874 

0.859 

0.821 

0.820 

0.776 

0.752 

0.734 

ftopt 

-1.92 

-0.68 

-2.37 

-0.68 

-1.92 

-1.96 

-1.68 

-3.78 

-1.60 

Wopt 

0.65 

3.40 

0.74 

5.80 

1.45 

1.40 

2.70 

0.76 

4.40 


Table 1 
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lambda*s(lambda) 
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lambda*s(lambda) 


1=10, w=10 



lambda 


Figure 2 
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Figure 3 
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Figure 4 
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